Systems Biology Approaches to Mining High Throughput Biological Data

Authors

  • Fang-Xiang Wu
  • Min Li
  • Jishou Ruan
  • Feng Luo
Abstract

With advances in high throughput measurement techniques, large-scale biological data have been and will continue to be produced, including gene expression data, protein-protein interaction (PPI) data, tandem mass spectra data, microRNA expression data, lncRNA expression data, and biomolecule-disease association data. Such data contain insightful information for understanding the mechanisms of molecular biological systems and have proved useful in diagnosis, treatment, and drug design for genetic disorders or complex diseases. For this focus issue, we invited researchers to contribute original research articles that develop or improve systems biology approaches to mining high throughput biological data.

With high throughput data, it is appealing to develop systems biology approaches to understand important biological processes. In the paper "Differential Expression Analysis in RNA-Seq by a Naive Bayes Classifier with Local Normalization," Y. Dou et al. developed a new tool, named GExposer, for identifying differentially expressed genes from RNA-Seq data. This tool introduced a local normalization algorithm to reduce the bias of nonrandomly positioned read depth. A Naive Bayes classifier was employed to integrate fold change, transcript length, and GC-content to identify differentially expressed genes. Results on several independent tests showed that GExposer performed better than other methods.

In the paper "K-Profiles: A Nonlinear Clustering Method for Pattern Detection in High Dimensional Data," K. Wang et al. designed the nonlinear K-profiles clustering method, which can be seen as the nonlinear counterpart of the K-means clustering algorithm. The method has a built-in statistical testing procedure that ensures genes not belonging to any cluster do not affect the estimation of cluster profiles. Results from extensive simulation studies showed that K-profiles clustering outperformed the traditional linear K-means algorithm. In addition, K-profiles clustering generated biologically meaningful results from a gene expression dataset.

Replicative senescence is of fundamental importance for the process of cellular aging. In the paper "Similarities in Gene Expression Profiles during In Vitro Aging of Primary Human Embryonic Lung and Foreskin Fibroblasts," S. Diekmann et al. elucidated the cellular aging process by comparing gene expression changes, measured by RNA-Seq, in fibroblasts originating from two different tissues, embryonic lung (MRC-5) and foreskin (HFF), at five different time points during their transition into senescence. Their results showed that a number of monotonically up- and downregulated genes had a novel strong functional link to aging and senescence related processes. More and more studies have shown that many complex diseases are contributed jointly …

Similar resources

Reverse engineering biomolecular systems using -omic data: challenges, progress and opportunities

Recent advances in high-throughput biotechnologies have led to rapidly growing research interest in reverse engineering of biomolecular systems (REBMS). 'Data-driven' approaches, i.e. data mining, can be used to extract patterns from large volumes of biochemical data at molecular-level resolution while 'design-driven' approaches, i.e. systems modeling, can be used to simulate emergent system ...

Practical Applications of Data Mining

Technologies have undoubtedly brought tremendous change to the field of bioinformatics and related areas. Extensive research is still being carried out on the fundamentals of data mining in genomics and proteomics, addressing recent research developments that depend on the analysis and interpretation of large amounts of data generated by high-throughput te...

Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM

Structural repetitive subsequences are among the most important portions of biological sequences and play crucial roles in the corresponding sequence's fold and functionality. The largest class of repetitive subsequences is "Transposable Elements", which has its own sub-classes depending on the structure of its context. Much research has been performed to critically determine the structure and function of repetitiv...

Classification of Information Fusion Methods in Systems Biology

Biological systems are extremely complex and often involve thousands of interacting components. Despite all efforts, many complex biological systems are still poorly understood. However, over the past few years high-throughput technologies have generated large amounts of biological data, now requiring advanced bioinformatic algorithms for interpretation into valuable biological information. Due...

Mathematical modeling of biological systems

Mathematical and computational models are increasingly used to help interpret biomedical data produced by high-throughput genomics and proteomics projects. The application of advanced computer models enabling the simulation of complex biological processes generates hypotheses and suggests experiments. Appropriately interfaced with biomedical databases, models are necessary for rapid access to, ...

Minding, OLAPing, and Mining Biological Data: Towards a Data Warehousing Concept in Biology

The considerable "algorithmic complexity" of biological systems requires a huge amount of detailed information for their complete description. High-throughput experiments (e.g., microarrays) are generating an overwhelming amount of data of biological systems at the molecular and cellular level. To adequately organize, analyze, and interpret this deluge of information will require new computatio...

Journal title:

Volume 2015  Issue 

Pages  -

Publication date 2015